Designing powerful, discriminative texture features that are robust to realistic imaging conditions is a challenging computer vision problem with many applications, including material recognition and the analysis of satellite or aerial imagery. In the past, most texture description approaches were based on dense orderless statistical distributions of local features. However, most recent approaches to texture recognition and remote sensing scene classification are based on Convolutional Neural Networks (CNNs). The de facto practice when learning these CNN models is to use RGB patches as input, with training performed on large amounts of labeled data (ImageNet). In this paper, we show that Binary Patterns encoded CNN models, codenamed TEX-Nets, trained using mapped coded images with explicit texture information, provide complementary information to standard RGB deep models. Additionally, two deep architectures, namely early and late fusion, are investigated to combine the texture and color information. To the best of our knowledge, we are the first to investigate Binary Patterns encoded CNNs and different deep network fusion architectures for texture recognition and remote sensing scene classification. We perform comprehensive experiments on four texture recognition datasets and four remote sensing scene classification benchmarks: UC-Merced with 21 scene categories, WHU-RS19 with 19 scene classes, RSSCN7 with 7 categories, and the recently introduced large-scale aerial image dataset (AID) with 30 aerial scene types. We demonstrate that TEX-Nets provide complementary information to a standard RGB deep model of the same network architecture. Our late fusion TEX-Net architecture consistently improves the overall performance compared to the standard RGB network on both recognition problems. Our final combination outperforms the state-of-the-art without employing fine-tuning or an ensemble of RGB network architectures.
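The core idea above, feeding a CNN mapped coded images built from local binary patterns rather than raw RGB, can be illustrated with a minimal NumPy sketch. The 8-neighbour LBP code below is the standard formulation; the final step, replicating the code map into three channels so it matches a CNN's RGB input shape, is a simplifying assumption for illustration and not necessarily the exact code-to-image mapping used by TEX-Nets.

```python
import numpy as np

def lbp_encode(gray):
    """8-neighbour Local Binary Pattern: each pixel gets an 8-bit code
    whose bits record whether each neighbour is >= the centre pixel."""
    h, w = gray.shape
    padded = np.pad(gray, 1, mode="edge")
    center = padded[1:-1, 1:-1]
    # Clockwise neighbour offsets starting at the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros((h, w), dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        neigh = padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
        codes |= (neigh >= center).astype(np.uint8) << bit
    return codes

def texture_coded_image(gray):
    """Illustrative 'mapped coded image': replicate the per-pixel LBP
    code map into three channels so it has the same layout as an RGB
    input (assumption; the actual TEX-Net mapping may differ)."""
    codes = lbp_encode(gray)
    return np.stack([codes] * 3, axis=-1)

gray = np.array([[10, 20], [30, 40]], dtype=np.uint8)
print(lbp_encode(gray))                  # per-pixel 8-bit texture codes
print(texture_coded_image(gray).shape)   # CNN-ready (H, W, 3) layout
```

Such a coded image can then be passed through the same network architecture as an RGB patch, which is what makes early fusion (stacking channels at the input) and late fusion (combining the two networks' features) directly comparable.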